Pseudoknots in a Homopolymer 
After a discussion of the definition and number of pseudoknots, we reconsider the self-attracting
homopolymer, paying particular attention to the scaling of the pseudoknot number ($N_{pk}$) in different
temperature regimes in two and three dimensions. We find that, although the total number of
pseudoknots is extensive at all temperatures, the number of those forming between the two halves of
the chain diverges logarithmically at (both dimensions) and below (2d only) the $\theta$-temperature. We
then introduce a simple model that emphasizes the role of pseudoknot formation during collapse.
The resulting phase diagram involves swollen, branched and collapsed homopolymer phases with
transitions between each pair.

PACS numbers: 36.20.Ey, 87.15.Aa, 05.40.-a 



I. INTRODUCTION 

Fueled mainly by advances in RNA structure determination
techniques, there has recently been growing interest in
understanding and predicting the formation of pseudoknots
in RNAs. A pseudoknot (PK) is not a true knot in the
conventional sense. It is a simpler construct generated by a
polymer's self-contacts (see the definition below) and is
therefore encountered more frequently. Unlike true knots,
which are problematic for DNA and occur only occasionally
in shorter biomolecules, PKs are elements of the tertiary
structure of folded RNAs. They are known to exist in almost
all RNA classes, including transfer, messenger, ribosomal,
viral, catalytic and self-splicing RNAs (see reviews [1]).
A recent analysis found that they account for up to 30% of
the bound base pairs in G+C rich RNA sequences [2]. In
addition to stabilizing the fold, PKs are believed to assume
functional roles, such as mediating the binding of the
proteins they encode [3], labeling functionally important
positions on the coding regions of the mRNA sequence [5],
mediating frameshifting, etc.

Being a more elementary topological formation than knots,
PKs are relatively amenable to numerical investigation.
Nevertheless, most of the earlier computational tools and
recent theoretical work on RNA structure prediction take into
consideration only those configurations without PKs [7, 8, 9].
This is mostly because ignoring PKs results in a 'nested' set
of equations and, as a consequence, allows efficient dynamic
programming techniques. The drawback is that their success is
limited to secondary structure prediction only, and even then
with limited accuracy, due partially to a necessary
reorganization of the secondary structure contacts to
accommodate the PKs. More recently there have appeared
computational [10, 11] and theoretical 0, 0, 0] studies that
include PKs in RNA structure prediction algorithms.
Pilsbury et al. suggest a diagrammatic expansion to take the
PKs into account perturbatively [12]. A recent study by
Baiesi et al. [13] takes a more physical look at RNA
denaturation. They model the RNA as a homopolymer traversing
a two-tolerant walk on the FCC lattice and consider walks
both with and without PKs [16]. They conclude that the sharp
second-order denaturation transition observed when PKs are
allowed gives way to a smooth crossover upon their exclusion.
This result emphasizes the thermodynamic relevance of PKs.
Lucas et al. [14] again consider lattice homopolymers to
argue that the denaturation transition between the
pseudoknotted state and the open state is continuous.
Another two-tolerant trail model with pseudoknots and with a
native state consisting of a single hairpin is argued to
denature through a first-order transition [15]. Studying the
interplay between the PKs and the transition thermodynamics
within the homopolymer context is a prerequisite for a deeper
understanding of the physics of RNA pseudoknots.

Folding experiments on RNAs suggest that it is physically
more appropriate to attribute a different binding energy to
the contacts that form a PK [17]. This energy can be tuned by
changing the Mg$^{2+}$ concentration in the solvent. Unless
this energy is prohibitively high (as in the case of very low
Mg$^{2+}$ concentration), calculating the Boltzmann weights
necessitates identifying PKs for an arbitrary configuration.
As we shall see below, this task, though it may be easier in
native RNA configurations, is nontrivial for an arbitrary
polymer.

Accordingly, our goal throughout this paper will be to 
explore the thermodynamic role assumed by the PKs in 
the well-known context of homopolymer collapse by: 

1. providing an analytic definition for the PK number 
(Section II), 

2. investigating the scaling properties of the PK number
in various regimes of the homopolymer collapse (Section
III),

3. generalizing the Hamiltonian for the self-attracting 
homopolymer to include an arbitrary penalty for PKs 
and obtaining the corresponding phase diagram (Section 
IV). 

We hope that our results will provide a better understanding
of the nature of the PKs and a new perspective on the
homopolymer collapse transition by locating it in a more
general framework that also includes the branched polymers
and their collapse transition.

II. COUNTING PSEUDOKNOTS 

An arbitrary configuration of a polymer chain can be
encoded as a contact map [18], which is a binary symmetric
matrix. The contact map, in turn, can be represented
graphically by an 'arc diagram' as follows: Imagine
stretching the polymer into a horizontal straight line by
pulling from the two ends. Next, connect each pair of
monomers that were originally in contact by a semicircular
arc on the upper half plane (the diagram is drawn on a plane
even though the polymer may be embedded in arbitrary
dimensions).
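As an illustration of this encoding, the following minimal Python sketch extracts the self-contacts of a lattice walk and returns them as arcs (i, j) with i < j. The function names `contacts` and `crosses`, the list-of-coordinates representation and the example walk are assumptions made here for illustration; they are not taken from the original study.

```python
def contacts(walk):
    """Arcs (i, j), i < j, for nearest-neighbor sites not consecutive along the walk."""
    index = {site: k for k, site in enumerate(walk)}
    dim = len(walk[0])
    arcs = []
    for i, site in enumerate(walk):
        for axis in range(dim):
            for step in (-1, 1):
                neighbor = list(site)
                neighbor[axis] += step
                j = index.get(tuple(neighbor))
                # j > i + 1 records each contact once and skips bonds of the chain
                if j is not None and j > i + 1:
                    arcs.append((i, j))
    return sorted(arcs)


def crosses(a, b):
    """True if two arcs cross in the arc diagram (arcs sharing an end point do not)."""
    (i, j), (k, l) = sorted((a, b))
    return i < k < j < l


# Hypothetical example: the shortest S-shaped walk on the square lattice.
s_walk = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 2), (1, 2)]
arcs = contacts(s_walk)
print(arcs)                       # [(0, 3), (2, 5)]
print(crosses(arcs[0], arcs[1]))  # True: the arc diagram is not planar
```

Sites are stored as tuples so that they can serve as dictionary keys; the same routine works on the cubic lattice because the neighbor search loops over all coordinate axes.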




FIG. 1: A sample SAW on a square lattice and its self- 
contacts (edges not traversed by the walk that connect two 
visited nearest-neighbor lattice sites) together with the cor- 
responding arc diagram and its non-unique minimal contact 
set (shown in bold) to be removed in order to reduce it to a 
planar diagram. 

The diagram is said to be planar if no two arcs cross each
other. This is equivalent to having no PKs. In the opposite
case, a selective treatment of the PKs primarily requires
identification of their number. Since the definition involves
crossings in the arc diagram, one might be inclined to simply
count the number of crossings. However, this does not make
physical sense, because an arbitrary number of crossings may
be generated by the addition of a single contact, as is
obvious from Fig. 1. Instead we need a quantity that reflects
the number of contacts that are responsible for the
pseudoknots. Accordingly, we define $N_{pk}$, the PK number,
as the minimum number of arcs that need to be removed to
reach a planar diagram. The same definition was recently
adopted in another study on RNA pseudoknot prediction [2]. We
stress that the choice of this minimal set is in general not
unique. In Fig. 1 we show a SAW in two dimensions and the
corresponding arc diagram, where one possible choice of a
minimal set of arcs which, when removed, leaves a planar
diagram is drawn in bold. Therefore, although one can talk
about a unique PK number, labeling some of the contacts as
PK-forming contacts requires adopting an extra arbitrary
convention. In this study, we avoid this by resting our
results on the mere knowledge of the number of PKs.

FIG. 2: The resulting graph for the arc diagram in Fig. 1 after
the mapping to the vertex-covering problem. Empty vertices
correspond to the solution represented by bold arcs in Fig. 1.

Our first observation is that calculating $N_{pk}$ exactly for
an arbitrary arc diagram belongs to a class of problems
known as NP-complete, implying that there is no known
deterministic polynomial-time algorithm for calculating
$N_{pk}$ [19]. We prove this by mapping our problem to one of
the six well-known problems in computer science that are
shown to be NP-complete, namely the 'vertex-covering'
problem. For a recent review of the vertex-covering problem
in the statistical physics context see [20]. The mapping is
easily established by representing each arc by a vertex and
drawing edges between pairs of vertices corresponding to
crossing arcs (see Fig. 2). The vertex-covering problem on
the obtained graph (known as the incompatibility graph)
amounts to finding a minimal set of vertices which, when
labelled, results in labelling at least one end of every
edge. In other words, erasing those vertices alone, together
with the edges sprouting from each, is sufficient to get rid
of all the edges in the graph. Since every edge reflects a
crossing and every vertex an arc in the original arc diagram,
eliminating those arcs corresponding to the minimal vertex
set obviously results in a planar diagram, i.e., the size of
this minimal set is equal to $N_{pk}$. Since we need to
calculate $N_{pk}$ for many polymer configurations in our
study, it is important to be aware that an exact treatment
requires CPU times exponential in the number of
arc-crossings. The above mapping is a different and simpler
statement of NP-completeness than an earlier proof that a
large class of RNA secondary structure prediction algorithms
based on free energy minimization with pseudoknots are
NP-complete [21]. Also note that when restricted to a certain
subset of polymer configurations (typically generated in 
an iterative manner), the problem can be solved in poly- 
nomial time. Efficient algorithms on such restricted con- 
figurations have been recently utilized to predict RNA 
pseudoknots @,H[I2. 
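The mapping described above can be sketched in a few lines of Python. The code below is an illustration only, under the assumption that arcs are given as index pairs (i, j) with i < j; the names `incompatibility_graph` and `pk_number_exact` and the example diagram are hypothetical. The exhaustive search over subsets makes the exponential cost of an exact treatment explicit and is usable only for small diagrams.

```python
from itertools import combinations


def crosses(a, b):
    """True if two arcs cross (arcs sharing an end point do not cross)."""
    (i, j), (k, l) = sorted((a, b))
    return i < k < j < l


def incompatibility_graph(arcs):
    """Edges (u, v) between the indices of crossing arcs."""
    return [(u, v) for u, v in combinations(range(len(arcs)), 2)
            if crosses(arcs[u], arcs[v])]


def pk_number_exact(arcs):
    """Size of a minimum vertex cover of the incompatibility graph."""
    edges = incompatibility_graph(arcs)
    for size in range(len(arcs) + 1):
        for subset in combinations(range(len(arcs)), size):
            removed = set(subset)
            # a vertex cover touches at least one end of every edge
            if all(u in removed or v in removed for u, v in edges):
                return size


# Hypothetical arc diagram of a nine-site S-shaped walk on the square lattice.
arcs = [(0, 5), (1, 4), (3, 8), (4, 7)]
print(incompatibility_graph(arcs))  # [(0, 2), (0, 3), (1, 2)]
print(pk_number_exact(arcs))        # 2
```

In this example the diagram has three crossings but only two arcs need to be removed, which is the distinction between the crossing number and the PK number drawn in the text.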

What is the expected value of $N_{pk}$ for a polymer? We
would first like to provide some insight for the reader by
calculating $N_{pk}$ for a random arc diagram: Consider $N_c$
randomly chosen distinct pairs of points on a line segment
and connect them by arcs as in Fig. 1. The probability
that $m$ randomly placed arcs do not intersect (i.e., form
a planar graph) is



$$P_{\mathrm{planar}}(m) \sim \frac{1}{m!} \qquad (1)$$



Proof by induction: the $m^{\mathrm{th}}$ non-intersecting arc to be
placed will see the polymer divided into $m$ compartments,
such that, if the arc is placed with its legs resting on
different compartments, it has to cross at least one other
arc. Then, the probability that the $m^{\mathrm{th}}$ arc doesn't cross
(given the first $m - 1$ arcs) is roughly $1/m$, i.e., the
probability that the two points fall into the same bin. Eq. (1)
follows by induction.

Then, such a subset of m planar arcs will be found 
among available distinct subsets when 



$$\binom{N_c}{m} > \frac{1}{P_{\mathrm{planar}}(m)} \qquad (2)$$



$m$ is maximized when the two sides are equal. Applying
Stirling's approximation on both sides leads to
$m_{\max} \propto \sqrt{N_c}$. In other words, the typical number of
pseudoknots for a random arc diagram approaches the number of
contacts in the thermodynamic limit ($N_c \to \infty$) as

$$\frac{N_c - \langle N_{pk} \rangle}{N_c} \propto \frac{1}{\sqrt{N_c}} .$$
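The step from Eq. (2) to the square-root law can be spelled out as follows. This is a sketch under the additional assumption $m \ll N_c$, which is checked a posteriori; the numerical prefactor is only as reliable as the rough estimate in Eq. (1).

```latex
\begin{align*}
\binom{N_c}{m} &\simeq \frac{N_c^{\,m}}{m!} \quad (m \ll N_c),
  \qquad \frac{1}{P_{\mathrm{planar}}(m)} \sim m! \\
\frac{N_c^{\,m}}{m!} \sim m!
  \;&\Longrightarrow\; N_c^{\,m} \sim (m!)^2
  \simeq 2\pi m \left(\frac{m}{e}\right)^{2m} \\
  \;&\Longrightarrow\; \ln N_c \simeq 2 \ln\frac{m}{e} + \frac{\ln (2\pi m)}{m} \\
  \;&\Longrightarrow\; m_{\max} \simeq e \sqrt{N_c} \qquad (N_c \to \infty)
\end{align*}
```

Since $m_{\max}$ is the size of the largest planar subset, $N_c - \langle N_{pk} \rangle \simeq m_{\max}$, and dividing by $N_c$ gives the $1/\sqrt{N_c}$ behavior quoted above. The result $m_{\max} \propto \sqrt{N_c} \ll N_c$ is consistent with the assumption made in the first line.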

Yet, true polymers and lattice walks do not come with 
random arc diagrams. There is a considerable corre- 
lation among the contacts due to the existence of an 
underlying chain and the effective repulsion resulting 
through self-avoidance, both of which favor contacts be- 
tween monomers that are closer along the chain. This 
tendency is reflected in the loop length distribution [s^ . 



$$p(\ell) \propto \ell^{-c}, \qquad (3)$$



where $c = d/2$ (random walk), $c = d\nu - \sigma_4 = 2.68, 2.22$
(SAW in 2d, 3d), and $\sigma_4$ is the critical dimension
associated with a 4-leg vertex as in the polymer network
theory of Duplantier [27]. Correspondingly, one expects
$N_{pk}$ for the real chains to be less than the above 'random
graph' value. The combinatorial argument presented above
picks random pairs of points along the chain with equal
probability irrespective of the distance in between
($c = 0$). Unfortunately, it does not generalize easily to
$c > 0$. However, our numerical analysis on random graphs with
arbitrary $c$ suggests

$$\langle N_c - N_{pk} \rangle \propto N_c^{\,q(c)},$$

with $q(c)$ increasing almost linearly from $q(0) = 1/2$ and
saturating at $q(c \simeq 2) = 1$. The interesting statistical
properties of the incompatibility graph with the distribution
in Eq. (3) will be reported elsewhere. A further reduction is
expected in two dimensions due to additional constraints
imposed by the impenetrability of encircled regions: The fact
that each polymer contact divides the plane into two
disconnected regions translates to having a bipartite
incompatibility graph. As a consequence, $N_{pk} \le N_c/2$
(since the arc diagram turns out to be planar when arcs are
allowed on both half-planes instead of one). Unlike the
general case, the vertex-cover problem on bipartite graphs is
solvable in polynomial time [22].
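One standard polynomial-time route for the bipartite case, not necessarily the one intended by the authors, uses König's theorem: in a bipartite graph the size of a minimum vertex cover equals the size of a maximum matching. The sketch below two-colors the graph and then finds a maximum matching with augmenting paths; the function name and the example edge list are assumptions made for illustration.

```python
from collections import deque


def min_vertex_cover_size_bipartite(n, edges):
    """Minimum vertex cover size of a bipartite graph on vertices 0..n-1."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Two-color every connected component by breadth-first search.
    color = [None] * n
    for start in range(n):
        if color[start] is not None:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if color[v] is None:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    raise ValueError("graph is not bipartite")

    match = [None] * n  # current partner of each vertex, if any

    def augment(u, seen):
        """Try to match left vertex u, re-matching earlier vertices if needed."""
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                if match[v] is None or augment(match[v], seen):
                    match[v] = u
                    match[u] = v
                    return True
        return False

    # Each successful augmentation enlarges the matching by one edge.
    return sum(augment(u, set()) for u in range(n) if color[u] == 0)


# Incompatibility graph of a hypothetical four-arc diagram (a path 3-0-2-1).
print(min_vertex_cover_size_bipartite(4, [(0, 2), (0, 3), (1, 2)]))  # 2
```

The recursion depth grows with the length of the augmenting paths, so an iterative variant would be preferable for diagrams with many arcs.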

One can obtain a lower bound on $N_{pk}$ from Kesten's
Pattern Theorem [23], i.e., by noting that a local
pseudoknotted pattern (e.g., the shortest S-shaped walk on a
square lattice) has a finite probability of occurrence in an
infinite chain. Therefore

$$\langle N_{pk} \rangle > aN \qquad (4)$$

for a walk of $N$ steps and for some $a > 0$. Extensivity of
$\langle N_{pk} \rangle$ for a homopolymer supports the observation
that their exclusion may have manifestations on the nature of
the transition in the thermodynamic limit [13] and that
penalizing pseudoknot formation can change the
low-temperature phase from collapsed to a branched polymer
(Section IV).



III. PSEUDOKNOT NUMERICS 

In this part of our analysis we look at the ordinary
homopolymer collapse, where the energy is not sensitive to
the pseudoknot formation. We obtained statistics numerically
for self-avoiding chains of typical size $N = 300$, and
occasionally checked the size independence of our results by
going up to $N = 800$. All our results were obtained by using
an improved version of the PERM algorithm developed by
Grassberger et al. [24]. Although with PERM it is typical to
simulate much longer homopolymers, our statistics were mainly
limited by the fact that $N_{pk}$ for each configuration has to
be calculated from scratch, unlike, e.g., the number of
contacts, which is updated incrementally at each step of the
walk. We present results for square and cubic lattices in two
and three dimensions, respectively.

It is possible to calculate $N_{pk}$ using an exact
backtracking search algorithm which is straightforward to
implement, but requires a runtime exponential in system size.
Since we want to obtain statistics for reasonably long
chains, it is not feasible to use such an exact method.
Instead, we calculate $N_{pk}$ approximately by means of a
greedy algorithm which at each step eliminates (one of) the
maximally crossing arc(s). The choice is made randomly when
there is more than one. Due to the stochastic nature of the
algorithm and the fact that the optimal selection may involve
eliminating a less than maximally crossing arc, this greedy
algorithm provides an upper bound to $N_{pk}$. For details of
various exact algorithms
and the above greedy algorithm we refer the reader to 

M. 
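The greedy procedure described above can be sketched as follows. This is a minimal illustration rather than the code used for the reported statistics: the function name `pk_number_greedy`, the arc representation (i, j) with i < j, and the quadratic recount of crossings at every step are assumptions chosen for brevity.

```python
import random


def crosses(a, b):
    """True if two arcs cross (arcs sharing an end point do not cross)."""
    (i, j), (k, l) = sorted((a, b))
    return i < k < j < l


def pk_number_greedy(arcs, rng=random):
    """Upper bound on N_pk: repeatedly remove one of the maximally crossing arcs."""
    remaining = list(arcs)
    removed = 0
    while remaining:
        # number of crossings of each arc with the arcs still present
        counts = [sum(crosses(a, b) for b in remaining) for a in remaining]
        worst = max(counts)
        if worst == 0:  # no crossings left: the diagram is planar
            break
        candidates = [a for a, c in zip(remaining, counts) if c == worst]
        remaining.remove(rng.choice(candidates))  # ties are broken at random
        removed += 1
    return removed


# Hypothetical arc diagram of a nine-site S-shaped walk on the square lattice.
print(pk_number_greedy([(0, 5), (1, 4), (3, 8), (4, 7)]))  # 2
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible; repeating the procedure with different seeds and keeping the smallest result can only tighten the upper bound.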
FIG. 3: Fraction of pseudocontacts in a SAW as a function of
temperature for different walk lengths in two dimensions.

For comparison, we also implemented an exact calculation of
$N_{pk}$ on the typical diagrams we encountered. We found that
the average deviation of the upper bound on $N_{pk}$ obtained
by the greedy algorithm from the exact value, although
increasing with growing chain length, approaches a constant
fraction around 1.5% of the exact $N_{pk}$. Therefore, we are
confident that our conclusions concerning the scaling
behavior are not affected by the approximate algorithm we
adopted.

In all temperature regimes, we find that $\langle N_{pk} \rangle$ is
a fraction of the total number of contacts with a
temperature-dependent proportionality constant, $a(T)$.
Although a Fermi-function-like limiting behavior is evident
from Fig. 3, the precise form of $a(T)$ in the thermodynamic
limit requires a more elaborate analysis, which we will not
attempt here.

$a(T)$ reveals the leading behavior in $\langle N_{pk} \rangle$;
however, the theory of critical phenomena has taught us that
the nontrivial behavior of many systems is reflected in the
non-analytic contribution to the extensive quantities.
Recently, Orlandini et al. [25] considered the scaling of
contacts formed between the two halves (referred to from here
on as A and B) of a SAW at the $\theta$-point as a direct and
precise way of measuring the crossover exponent, $\phi_\theta$.
The number of such contacts scales as

$$\langle N_c^{AB}(T_\theta) \rangle \propto N^{\phi_\theta}, \qquad (5)$$

where $\phi_\theta = 3/7$ in two dimensions, as can be shown
analytically by using a recent extension of the
Saleur-Duplantier results for polymer criticality [26]. The
value of the crossover exponent is typically difficult to
confirm by numerics because, as a rule, it has to be
extracted from subleading singular terms when considering the
set of all contacts along the chain [28]. Focusing on the
contacts between the two halves strongly filters out the
dominant analytic contribution of the local contacts along
the chain, thus surfacing the otherwise concealed
non-analyticity. AB-contact statistics have been fruitful in
a variety of polymer models [25, 29].

We use the same method here to pinpoint the singularity of
$\langle N_{pk} \rangle$, which to leading order scales identically to
the contact energy (extensivity of $\langle N_{pk} \rangle$ due to
Kesten's Pattern Theorem). More precisely, we start by
identifying all A-B contacts, $N_c^{AB}$, as above, and then
calculate the number of A-B pseudoknots, $N_{pk}^{AB}$, by
eliminating the crossing arcs in the arc diagram
corresponding to A-B contacts only. Since $\langle N_{pk} \rangle$ on
the whole chain is a fraction of the contacts,
$\langle N_c \rangle$, at all temperatures, one may expect that
$\langle N_{pk}^{AB} \rangle$ should also scale identically to
$\langle N_c^{AB} \rangle$ at all temperatures. In contrast, we find a
qualitatively different scaling of the A-B pseudoknot number.

Let us consider each temperature regime separately: 

Above the $\theta$-point ($T > T_\theta$), no surprises are expected:
$\langle N_c^{AB} \rangle$ already saturates to a constant in both
two and three dimensions. Since $N_{pk}^{AB} \le N_c^{AB}$, this
leaves no alternative to $\langle N_{pk}^{AB} \rangle$ but to stay
finite as well. This is confirmed by our numerics in Fig. 5.

At the $\theta$-point, the PERM algorithm is the most efficient.
Therefore we expect our results to be most accurate in this
region. Recall that $\langle N_c^{AB} \rangle \propto N^{3/7}$ for
$T = T_\theta$, which can be obtained analytically and verified
numerically to high accuracy. Our numerical results for
$\langle N_{pk}^{AB} \rangle$, on the other hand, indicate a
logarithmic divergence of the form

$$\langle N_{pk}^{AB}(T_\theta) \rangle \propto (\log N)^{\nu_\theta}, \qquad (6)$$

where, to our best estimate, $\nu_\theta \simeq 3.9$ in two
dimensions (Fig. 5). In three dimensions, the logarithmic
behavior survives (only at $T = T_\theta$), albeit with a
different exponent $\nu_\theta \simeq 4.3$.

In retrospect, one could argue a priori that
$\langle N_{pk}^{AB} \rangle$ and $\langle N_c^{AB} \rangle$ should have
qualitatively different scaling properties at the
$\theta$-temperature:

We recall that the easiest way to obtain $\phi_\theta$ is to use
the correspondence between a ring polymer at the $\theta$-point
and the full hull of a percolating cluster at the percolation
transition in two dimensions [25]. In this picture, the
AB-contacts correspond to the 'red' contacts between the two
halves that lie between two diametrically opposite points of
the hull (which are at the opposite infinities, see
Fig. 4(a)).

FIG. 4: The curves A (thick) and B (thin) represent the two
halves of the full hull of a percolating cluster at the
percolation transition. The number of AB-contacts (indicated by
the cutting line segments) scales as $N^{3/7}$ in the
thermodynamic limit when the two diametrically opposite points
on the hull bordering A and B are taken to infinity (a). An
AB-pseudoknot cannot be described in this setting once the
thermodynamic limit is taken (b).

The key observation is that an AB-pseudoknot cannot be
formed 'locally' between the two halves, because one of the
two halves of the chain should wrap around the midpoint to
make a pseudoknot-forming A-B contact, as shown in Fig. 4(b).
This imposes a rather stringent condition on the A-B
pseudoknot formation. The numerical result of Eq. (6)
nevertheless indicates a logarithmic divergence for
$\langle N_{pk}^{AB} \rangle$. The scenario depicted in Fig. 4
suggests a likely connection to the statistics of the
homopolymer winding angle, $\omega$, for which

$$\langle \omega^2 \rangle \propto \log N \qquad (7)$$

in the swollen phase and at the $\theta$-point in two dimensions
[30]. Yet, this would have the interesting implication that
the similar log-scaling observed in three dimensions has a
different origin. These possibilities will be investigated in
the future.



Below the $\theta$-point and in two dimensions, the logarithmic
growth of $\langle N_{pk}^{AB} \rangle$ appears to persist. Although
the numerics in this region are not as reliable, the lack of
a power-law behavior similar to that for $\langle N_{pk} \rangle$ is
not surprising due to the above geometric considerations.
Note that the scaling of $\langle N_c^{AB} \rangle$ in this regime is
still power-law [31]. In three dimensions and for
$T < T_\theta$, preliminary calculations suggest that the
logarithmic behavior in $\langle N_{pk}^{AB} \rangle$ does not
persist. This is probably due to the fully A-B co-penetrated
configurations of the compact polymer in 3d. Unlike in 2d,
the A-B boundary fills the volume.
FIG. 5: Scaling of the A-B pseudoknot number for $T > T_\theta$
(2d), $T = T_\theta$ (2d) and $T = T_\theta$ (3d). Estimated asymptotic
slopes are $\nu_\theta \simeq 0$, 3.9 and 4.3, respectively. Maximum walk
size was 800 steps in both dimensions.






FIG. 6: The transition temperature for each value of $a$ is located
as the crossing point of the $\langle R_e \rangle / \langle R_g \rangle$ curves as a
function of temperature for different polymer lengths. The curves
are interpolations between the data points for two dimensions and
$a = 0.3$.



IV. PSEUDOKNOT-SENSITIVE 
HOMOPOLYMER COLLAPSE 



The pseudoknot formation is essential for the collapse of a
polymer to a compact structure. This is most easily seen by
comparing the radius of gyration, $R_g$, for a self-attracting
homopolymer with that for which the pseudoknot-forming
contacts are excluded from the energy calculation. Consider
the following generalized Hamiltonian for a self-avoiding
lattice walk:

$$H = \sum_{i<j} \epsilon_c\, \Delta(i,j) - a\, \epsilon_c N_{pk} \qquad (8)$$

with $\Delta(i,j) = 1$ for pairs of occupied nearest-neighbor
lattice sites not consecutive along the walk and 0 otherwise.
$a = 0, \infty$ correspond respectively to the usual
self-attracting homopolymer with and without pseudoknots.
$a = 1$ is the case when the homopolymer energy is given by
the number of contacts in the maximal planar
'sub'-arc-diagram. In Eq. (8), we deliberately avoided
writing down $N_{pk}$ as a sum over contacts, since only the
number $N_{pk}$ and not the identity of the PK-forming contacts
is well-defined.

We located the transition point for different values of $a$
by plotting $\langle R_e \rangle / \langle R_g \rangle$, the ratio of the
averaged end-to-end distance and the radius of gyration, vs.
temperature. This universal ratio as a function of
temperature should converge to a step function in the
thermodynamic limit ($N \to \infty$) with a universal
intermediate value at the $\theta$-point [32, 33]. The crossing
point of the curves in Fig. 6 for different chain sizes is an
efficient way of locating the transition temperature and the
critical universal ratio at the transition [13]. The
resulting phase diagram is given in Fig. 7.
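For concreteness, the two lengths entering the universal ratio can be computed for a single configuration as in the sketch below. The function names are hypothetical, and the averaging over the ensemble (with the appropriate weights of the sampling algorithm) that defines $\langle R_e \rangle$ and $\langle R_g \rangle$ is not shown.

```python
import math


def end_to_end_distance(walk):
    """Distance between the first and the last site of the walk."""
    return math.dist(walk[0], walk[-1])


def radius_of_gyration(walk):
    """Root-mean-square distance of the sites from their center of mass."""
    n = len(walk)
    center = [sum(coords) / n for coords in zip(*walk)]
    return math.sqrt(sum(math.dist(site, center) ** 2 for site in walk) / n)


# Hypothetical example: a fully stretched three-site walk.
rod = [(0, 0), (1, 0), (2, 0)]
print(end_to_end_distance(rod))  # 2.0
print(radius_of_gyration(rod))   # sqrt(2/3)
```

Both functions accept sites of any dimension, so the same code applies to square and cubic lattices.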

A. Phase Diagram 

Typical low-temperature configurations in the limit
$a \to \infty$ are double-stranded branched structures as shown
in the inset of Fig. 7. In fact, one can show that the
ground-state configurations in this limit are the
next-nearest-neighbor (nnn) avoiding lattice trees. For a
proof in two dimensions, it is sufficient to note that the
number of energetically favorable contacts (size of the
maximal planar sub-diagram) is maximized when the two ends of
the walk meet at nearest-neighbor sites to form a fully
deflated self-avoiding ring. Thus at zero temperature, the
Hamiltonian in Eq. (8) with $a = 1$ is equivalent to the
Leibler-Singh-Fisher (LSF) model [34] of planar vesicles with
negative area fugacity. The LSF model with negative pressure
is established to have a branched polymer (BP)
low-temperature phase. The corresponding BP lives on the
dual-lattice points inside the ring and the self-avoidance of
the ring translates to next-nearest-neighbor-avoiding
branches in the dual lattice. Consistently, we numerically
obtain for the radius-of-gyration exponent
$\nu(T < T_c) \simeq 0.62$, very close to the value of
$\nu_{BP} = 0.64$. The SAW$\to$BP transition was studied earlier
in several lattice polymer models. We note that the
Hamiltonian in Eq. (8) exhibits a SAW-BP transition also in
higher dimensions, especially in three dimensions, which
could be relevant to RNA folding. The zero-temperature
mapping to lattice trees presented above applies to three
dimensions as well, although the dual lattice on which the
corresponding branched polymers live is not simply the
shifted simple cubic lattice.



The low-temperature scaling of $\langle R_g \rangle$ along the line
$a = 1$ is still BP-like. Deep in the BP phase,
$\epsilon_{pk} = (a - 1)\,\epsilon_c$ acts as a contact interaction
between the branches (negative $a$ being the attractive
regime). Considering earlier studies on BPs [38], we then
expect a second transition line in each dimension between two
low-temperature phases, BP and the collapsed polymer (CP), as
shown by the dashed line, for two dimensions only, in Fig. 7.
The simplest scenario is that the BP-CP boundary splits from
the SAW-BP boundary at the $\theta$-point and asymptotically
approaches the $a = 1$ line such that
$\epsilon_{pk}/T = \mathrm{const}$. The critical interaction for the
collapse of lattice trees has been extensively studied in the
literature [38, 39] for both dimensions. We merely speculate
on this section of the phase diagram, since obtaining good
statistics at temperatures low enough to distinguish the two
phases was not possible.

An interesting feature of the phase diagram in Fig. 7 is the
crossing of the phase boundaries corresponding to two and
three dimensions around $a = 1$, reflecting the fact that
$T_\theta$ for a self-attracting homopolymer ($a = 0$) increases
with increasing dimension, whereas the SAW$\to$BP transition
temperature for $a \gg 1$ has the opposite trend! The
transition temperature in each case is determined by the
interplay between the entropy of the coil and the energy of
the collapsed state. For $a = 0$, as we switch from the square
to the cubic lattice, the increased gain in contact energy by
collapse (due to the higher number of nearest-neighbor sites)
overshadows the increased loss of entropy (due, roughly, to
the change in the connectivity constant). As a result,
$k_B T_\theta / \epsilon_c$ moves up from 1.5 to 3.5. In the other
limit ($a \gg 1$), the contact energy due to the partial
collapse to a BP is proportional to $N$ and independent of
dimension. Yet, the entropy loss due to collapse still
increases with dimensionality. Then, the collapse to a BP
should happen at a lower temperature with increasing
dimension. The two opposing trends cancel each other around
$a = 1$.

Also note that positive but small values of $1 - a$ describe a
transition, upon reducing the temperature, first to a
branched-polymer-like state, followed by a PK-mediated
collapse. Such collapse (or melting) of RNAs with an
intermediate, pronounced with lowered Mg$^{2+}$ concentration,
has been experimentally observed [40, 41] and discussed in
[15].



V. CONCLUSIONS 

To summarize, we attempted in this paper to provide a
mathematical definition for the number of pseudoknots,
$N_{pk}$, in a polymer chain. With this definition, we show
that counting the number of pseudoknots is equivalent to the
well-known 'vertex-cover' problem, which is NP-complete.
Nevertheless, it is possible to study numerically the
statistical properties of pseudoknots by employing an
efficient approximate scheme. We show that the average total
number of pseudoknots is extensive at all temperatures;
however, the number of pseudoknots forming between the two
halves of the chain scales logarithmically with chain size at
the $\theta$-point of a homopolymer in both two and three
dimensions, and also for $T < T_\theta$ in two dimensions. This
logarithmic character is likely to be related to the
winding-angle statistics in two dimensions.

FIG. 7: The phase diagram for the Hamiltonian in Eq. (8).
The lower horizontal axis corresponds to the usual homopolymer
with self-attraction. SAW-BP transition lines were obtained
from the numerical data by the universal-ratio crossing
method described in the text. The terminal point on the lower
axis of each transition curve corresponds to the $\theta$-point for
that dimension. The dashed curve is the expected BP-CP boundary
with a limiting ($T \to 0$) value of $(1 - a)/T = 0.69$.

We also studied the role of pseudoknots in the homopolymer
collapse by considering a Hamiltonian which favors polymer
self-contacts but penalizes pseudoknots. We showed that in
the absence of an energetic preference for pseudoknot
formation, the low-temperature phase is a branched polymer.
When the ratio of the two competing energies satisfies
$0 < a < 1$, Hamiltonian (8) allows a transition scenario from
a SAW to a collapsed phase with an intermediate branched
polymer regime, where the BP-CP transition is mediated by
pseudoknot formation. The critical properties of the phase
diagram and relevance to RNA folding need further
investigation.



VI. ACKNOWLEDGEMENTS 

A.K. thanks D. Yuret and C.E. Soteros for pointing 
out, respectively, the NP-complete character of the prob- 
lem and the existence of a polynomial-time solution on 
bipartite graphs. We acknowledge support from INFM- 
PA02 and MIUR-COFIN01. 



[1] E.B. ten Dam, C.W.A. Pleij, and D. Draper, Biochem- 
istry 31, 11665 (1992); C.W.A. Pleij, Curr. Opin. Struc. 
Biol. 4, 337 (1994). 

[2] A. Xayaphoummine, T. Bucher, F. Thalmann, and H.
Isambert, cond-mat/0309117.

[3] D.S. McPheeters, G.D. Stormo, and L. Gold, J. Mol.
Biol. 201, 517 (1990).

[4] Z. Du, J.A. Holland, M.R. Hansen, D.P. Giedroc, and 
D.W. Hoffman, J. Mol. Biol. 271, 463 (1997). 

[5] N.M. Wills, R.F. Gesteland, and J.F. Atkins, Proc. Natl. 
Acad. Sci. 88, 6991 (1991). 

[6] P.J. Farabaugh, Cell 74, 591 (1993). 

[7] R. Nussinov and A.B. Jacobson, Proc. Natl. Acad. Sci.
77, 6309 (1980).

[8] M. Zuker and P. Stiegler, Nucl. Acids Res. 9, 133 (1981).

[9] R. Bundschuh and T. Hwa, Phys. Rev. Lett. 83, 1479 
(1999). 

[10] E. Rivas, S. Eddy, J. Mol. Biol. 285, 2053 (1999). 

[11] Y. Uemura, A. Hasegawa, S. Kobayashi, and T. Yokomori,
Theo. Comp. Sci. 210, 277 (1999).

[12] M. Pilsbury, J.A. Taylor, H. Orland, and A. Zee,
cond-mat/0310505.

[13] M. Baiesi, E. Orlandini, and A.L. Stella, Phys. Rev. Lett.
87, 070602 (2003).

[14] A. Lucas and K.A. Dill, J. Chem. Phys. 119, 2414 (2003).

[15] P. Leoni and C. Vanderzande, Phys. Rev. E 68, 051904
(2003).

[16] There is experimental evidence that these regimes can
be reached by modifying the Mg$^{2+}$ concentration in the
solution. E.g., see V.P. Misra and D.E. Draper, Biopolymers
48, 113 (1999).

[17] I. Tinoco Jr. and C. Bustamante, J. Mol. Biol. 293, 271
(1999).

[18] M. Vendruscolo, B. Subramanian, I. Kanter, E. Domany, 
and J. Lebowitz, Phys. Rev. E 59, 977 (1999). 

[19] M.R. Garey and D.S. Johnson, Computers and
Intractability: A Guide to the Theory of NP-Completeness,
W.H. Freeman & Co. (1979).

[20] M. Weigt and A.K. Hartmann, Phys. Rev. E 63, 056127 
(2001). 

[21] R.B. Lyngso and C.N.S. Pedersen, J. Comp. Biol. 7, 409
(2000).

[22] G. Chartrand and O. Oellermann, "Applied and Algo- 
rithmic Graph Theory", McGraw-Hill (1992). 

[23] N. Madras, G. Slade, The Self-Avoiding Walk, 
Birkhauser Press (1996). 

[24] H.-P. Hsu, V. Mehra, W. Nadler, and P. Grassberger, J.
Chem. Phys. 118, 444 (2003).

[25] E. Orlandini, F. Seno, A.L. Stella, Phys. Rev. Lett. 84, 
294 (2000). 

[26] M. Aizenman, B. Duplantier, and A. Aharony, Phys. Rev. 
Lett. 83, 1359 (1999). 

[27] B. Duplantier, J. Stat. Phys. 54, 581 (1988).

[28] The connection between the crossover exponent and the 
sub-dominant factor in the free energy was recently em- 
phasized to explain certain numerical results. See A.L. 
Owczarek, T. Prellberg, Phys. Rev. E 67, 032801 (2003). 

[29] M. Baiesi, E. Carlon, E. Orlandini, A.L. Stella, Phys.
Rev. E 63, 041801 (2001).

[30] B. Drossel and M. Kardar, Phys. Rev. E 53, 5861 (1996). 

[31] A naive argument based on a mapping of the
low-temperature AB-contacts to high-temperature SAW suggests
an exponent $\phi = 2/3$ (unpublished).

[32] C. Vanderzande, Lattice Models of Polymers, Cambridge 
University Press (1996). 

[33] V. Privman, P.C. Hohenberg, and A. Aharony in Phase 
Transitions and Critical Phenomena, edited by C. Domb 
and J.L. Lebowitz, Academic Press, New York (1991), 
vol. 14 p. 1. 

[34] S. Leibler, R.R.P. Singh, and M.E. Fisher, Phys. Rev. 
Lett. 59, 1989 (1987). 



[35] E. Orlandini, F. Seno, A.L. Stella, and M.C. Tesi, Phys.
Rev. Lett. 68, 488 (1992).

[36] R. Dekeyser, E. Orlandini, A.L. Stella, and M.C. Tesi,
Phys. Rev. E 52, 5214 (1995).

[37] A.L. Stella, Phys. Rev. E 50, 3259 (1994).

[38] F. Seno and C. Vanderzande, J. Phys. A 27, 5813 (1994).

[39] N. Madras, E.J. Janse van Rensburg, J. Stat. Phys. 86,
1 (1997).

[40] D.K. Treiber, M.S. Rook, P.P. Zarrinkar, and J.R.
Williamson, Science 279, 1943 (1998).
[41] J. Pan and S.A. Woodson, J. Mol. Biol. 280, 597 (1998). 



